24 research outputs found
Identifying Justifications in Written Dialogs By Classifying Text as Argumentative
In written dialog, discourse participants need to justify the claims they make, to convince the reader that the claim is true and/or relevant to the discourse. This paper presents a new task (with an associated corpus): detecting such justifications. We investigate the nature of these justifications and observe that they often contain discourse structure of their own. We therefore develop a method to detect the presence of certain types of discourse relations, which helps us classify whether a segment is a justification. The task itself is new, and our approach is novel in that it uses a large set of connectives (which we call indicators) and a large set of discourse relations, without choosing among them.
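As a toy illustration of the indicator idea (the connective list below is invented for illustration, not the paper's much larger set), a minimal baseline might flag a segment as a candidate justification whenever it contains a causal or inferential connective:

```python
import re

# Illustrative connective list; the paper's "indicators" set is much larger.
INDICATORS = {"because", "since", "therefore", "thus", "hence", "so"}

def is_candidate_justification(segment):
    """Flag a segment if it contains any indicator connective (toy baseline)."""
    tokens = re.findall(r"[a-z']+", segment.lower())
    return any(tok in INDICATORS for tok in tokens)
```

A real system would, as the abstract notes, also detect which discourse relations the connectives signal, rather than relying on lexical presence alone.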
Aggregated Word Pair Features for Implicit Discourse Relation Disambiguation
We present a reformulation of the word pair features typically used for the task of disambiguating implicit relations in the Penn Discourse Treebank. Our word pair features achieve significantly higher performance than the previous formulation when evaluated without additional features. In addition, we present results for a full system using additional features, which achieves close to state-of-the-art performance without resorting to gold syntactic parses or to context outside the relation.
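For reference, the classic (pre-reformulation) word pair feature family is the cross product of tokens from the two arguments of a relation. A minimal sketch, with toy whitespace tokenization and an illustrative feature-naming scheme:

```python
from itertools import product

def word_pair_features(arg1, arg2):
    """Cross-product word pairs between the two arguments of an
    implicit discourse relation (toy whitespace tokenization)."""
    return {f"{w1}|{w2}"
            for w1, w2 in product(arg1.lower().split(), arg2.lower().split())}
```

This version produces one sparse feature per token pair; the abstract's contribution is an aggregated reformulation of this family rather than the raw cross product shown here.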
Data-Driven Solutions to Bottlenecks in Natural Language Generation
Concept-to-text generation suffers from what can be called generation bottlenecks: aspects of the generated text that must change for different subject domains and that are usually hard to obtain or require manual work. Examples include domain-specific content, a type system, a dictionary, discourse style, and lexical style. These bottlenecks have stifled attempts to create generation systems that are generic, or that at least apply to a wide range of domains in non-trivial applications.
This thesis is comprised of two parts. In the first, we propose data-driven solutions that automate obtaining the information and models required to solve some of these bottlenecks. Specifically, we present an approach to mining domain-specific paraphrasal templates from a simple text corpus; an approach to extracting a domain-specific taxonomic thesaurus from Wikipedia; and a novel document planning model which determines both ordering and discourse relations, and which can be extracted from a domain corpus. We evaluate each solution individually and independently from its ultimate use in generation, and show significant improvements in each.
In the second part of the thesis, we describe a framework for creating generation systems that rely on these solutions, as well as on hybrid concept-to-text and text-to-text generation, and that can be automatically adapted to any domain using only a domain-specific corpus. We illustrate the breadth of this framework with three example applications: biography generation and company description generation, which we use to evaluate the framework itself and the contribution of our solutions; and justification of machine learning predictions, a novel application which we evaluate in a task-based study to show its importance to users.
Interactive hybrid approach to combine machine and human intelligence for personalized rehabilitation assessment
Automated assessment of rehabilitation exercises using machine learning has the potential to improve current rehabilitation practices. However, it is challenging to fully replicate a therapist's decision making when assessing patients with various physical conditions. This paper describes an interactive machine learning approach that iteratively integrates a data-driven model with an expert's knowledge to assess the quality of rehabilitation exercises. Among a large set of kinematic features of the exercise motions, our approach identifies the most salient features for assessment using reinforcement learning, and generates a user-specific analysis to elicit feature relevance from a therapist for personalized rehabilitation assessment. By accommodating the therapist's feedback on feature relevance, our approach can tune a generic assessment model into a personalized model. Specifically, our approach improves assessment-prediction performance from 0.8279 to 0.9116 average F1-score over three upper-limb rehabilitation exercises (p < 0.01). Our work demonstrates that machine learning models with feature selection can generate kinematic feature-based analysis as explanations of a model's predictions to elicit an expert's knowledge of assessment, and that machine learning models can be augmented with an expert's knowledge for personalized rehabilitation assessment.
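A minimal sketch of the personalization step only, assuming a simple linear assessment and a reweighting scheme: the feature names and the boost-and-renormalize rule are illustrative assumptions, not the paper's reinforcement-learning method.

```python
def assess(features, weights):
    """Exercise-quality score as a weighted sum of kinematic features."""
    return sum(weights[k] * v for k, v in features.items())

def personalize(weights, relevant, boost=2.0):
    """Up-weight features a therapist marks as relevant, then renormalize
    so the weights still sum to one (toy stand-in for model tuning)."""
    w = {k: (v * boost if k in relevant else v) for k, v in weights.items()}
    total = sum(w.values())
    return {k: v / total for k, v in w.items()}
```

The point of the sketch is the interaction loop: the generic weights produce an explanation, the therapist marks which features actually matter for this patient, and the model is tuned accordingly.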
Actionable Recourse in Linear Classification
Machine learning models are increasingly used to automate decisions that affect humans, such as deciding who should receive a loan, a job interview, or a social service. In such applications, a person should have the ability to change the decision of a model. When a person is denied a loan by a credit-scoring model, for example, they should be able to alter its input variables in a way that guarantees approval. Otherwise, they will be denied the loan for as long as the model is deployed and, more importantly, will lack the ability to influence a decision that affects their livelihood.

In this paper, we frame these issues in terms of recourse, which we define as the ability of a person to change the decision of a model by altering actionable input variables (e.g., income, as opposed to age or marital status). We present integer programming tools to ensure recourse in linear classification problems without interfering in model development. We demonstrate how our tools can inform stakeholders through experiments on credit-scoring problems. Our results show that recourse can be significantly affected by standard practices in model development, and motivate the need to evaluate recourse in practice.

Extended version. ACM Conference on Fairness, Accountability, and Transparency (FAT* 2019).
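To make the notion of recourse concrete, here is a toy sketch: a linear "approve if score >= 0" model, and a brute-force search over a small action grid standing in for the paper's integer program. All weights, feature names, and numbers are invented for illustration.

```python
from itertools import product

# Toy linear credit model: score = w . x + b, approve if score >= 0.
# Feature order: [income_k, debt_k, age]; age is not actionable.
w = [0.5, -0.4, 0.1]
b = -30.0

def score(x):
    return sum(wi * xi for wi, xi in zip(w, x)) + b

def cheapest_recourse(x, actions):
    """Search a small grid of feasible changes to actionable features;
    return the minimal-cost (L1) action that flips the decision to
    'approve', or (None, inf) if no feasible action exists."""
    best, best_cost = None, float("inf")
    for deltas in product(*actions):
        x_new = [xi + d for xi, d in zip(x, deltas)]
        cost = sum(abs(d) for d in deltas)
        if score(x_new) >= 0 and cost < best_cost:
            best, best_cost = deltas, cost
    return best, best_cost

applicant = [40.0, 50.0, 30.0]      # denied: score(applicant) == -27
actions = [range(0, 41, 5),         # raise income by 0..40k
           range(-40, 1, 5),        # pay down up to 40k of debt
           [0]]                     # age: immutable, no action allowed
```

Fixing the immutable feature's action set to `[0]` is what makes the recourse "actionable" in the abstract's sense; a real formulation would solve this as an integer program rather than by enumeration.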
Predicting the impact of scientific concepts using full‐text features
Peer reviewed. http://deepblue.lib.umich.edu/bitstream/2027.42/134425/1/asi23612.pd
Data-driven sentence simplification: Survey and benchmark
Sentence Simplification (SS) aims to modify a sentence in order to make it easier to read and understand. To do so, several rewriting transformations can be performed, such as replacement, reordering, and splitting. Executing these transformations while keeping sentences grammatical, preserving their main idea, and generating simpler output is a challenging and still far from solved problem. In this article, we survey research on SS, focusing on approaches that attempt to learn how to simplify using corpora of aligned original-simplified sentence pairs in English, which is the dominant paradigm nowadays. We also include a benchmark of different approaches on common datasets so as to compare them and highlight their strengths and limitations. We expect that this survey will serve as a starting point for researchers interested in the task and help spark new ideas for future developments.
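As a toy illustration of two of these transformations, lexical replacement and splitting (the word list and rules below are invented, not taken from any system in the survey):

```python
import re

# Illustrative substitution dictionary; real data-driven systems learn
# such replacements from aligned original-simplified corpora.
SIMPLER = {"utilize": "use", "commence": "begin", "terminate": "end"}

def replace_words(sentence):
    """Lexical replacement: swap each word for a simpler synonym if known."""
    return " ".join(SIMPLER.get(w.lower(), w) for w in sentence.split())

def split_sentence(sentence):
    """Splitting: break a compound sentence at ', and' into two sentences."""
    parts = re.split(r",\s*and\s+", sentence.rstrip("."))
    return [p[0].upper() + p[1:] + "." for p in parts if p]
```

The hard part the abstract points to is invisible in this sketch: choosing when to apply each transformation so the output stays grammatical, meaning-preserving, and genuinely simpler.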